Grammar-based Compression of Unranked Trees
Abstract
We introduce forest straight-line programs (FSLPs) as a compressed representation of unranked ordered node-labelled trees. FSLPs are based on the operations of forest algebra and generalize tree straight-line programs. We compare the succinctness of FSLPs with two other compression schemes for unranked trees: top dags and tree straight-line programs of first-child/next-sibling encodings. Efficient translations between these formalisms are provided. Finally, we show that equality of unranked trees in the setting where certain symbols are associative or commutative can be tested in polynomial time. This generalizes previous results for testing isomorphism of compressed unordered ranked trees.
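To make the first-child/next-sibling encoding mentioned in the abstract concrete, here is a minimal Python sketch of the standard encoding of an ordered unranked forest as a binary tree. It is an illustration only, not the paper's construction: the names `Node`, `BinNode`, `fcns`, and `decode` are hypothetical, and no compression (FSLP, top dag, or otherwise) is performed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Unranked ordered tree node: a label and any number of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class BinNode:
    """Binary node: `left` is the first child, `right` is the next sibling."""
    label: str
    left: Optional["BinNode"] = None
    right: Optional["BinNode"] = None


def fcns(forest: List[Node]) -> Optional[BinNode]:
    """Encode an ordered forest as a binary tree (None for the empty forest)."""
    if not forest:
        return None
    head, rest = forest[0], forest[1:]
    return BinNode(head.label, fcns(head.children), fcns(rest))


def decode(b: Optional[BinNode]) -> List[Node]:
    """Invert `fcns`: walk the sibling chain and recurse into first children."""
    forest = []
    while b is not None:
        forest.append(Node(b.label, decode(b.left)))
        b = b.right
    return forest


# Hypothetical example tree: a(b, c(d), e)
tree = Node("a", [Node("b"), Node("c", [Node("d")]), Node("e")])
encoded = fcns([tree])
assert decode(encoded) == [tree]  # the encoding is lossless
```

In the encoded tree, the root `a` has no right child (it has no sibling), its left child is `b`, and the siblings `b`, `c`, `e` form a chain of right children. A tree straight-line program would then be built over this binary tree rather than over the original unranked one.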
Related papers
Dictionary-Based Tree Compression
Trees are a ubiquitous data structure in computer science. LISP, for instance, was designed to manipulate nested lists, that is, ordered unranked trees. Already at that time, DAGs were used to detect common subexpressions, a process known as “hash consing.” In a DAG every distinct subtree is represented only once (but can be referenced many times) and hence it constitutes a dictionary-based comp...
Logical Definability and Query Languages over Unranked Trees
Unranked trees, that is, trees with no restriction on the number of children of nodes, have recently attracted much attention, primarily as an abstraction of XML documents. In this paper, we study logical definability over unranked trees, as well as collections of unranked trees, that can be viewed as databases of XML documents. The traditional approach to definability is to view each tree as a...
A Note on Recognizable Sets of Unranked and Unordered Trees
Recognizable sets of unranked, unordered trees have been introduced in Courcelle [C89] in a Myhill-Nerode [N58] style of inverse homomorphisms of suitable finite magmas. This is equivalent to being the union of some congruence classes of a congruence of finite index. We will add to the well-known concept of regular tree grammars a handling of nodes labeled with ε. With this rather unconvent...
Equivalences between Ranked and Unranked Weighted Tree Automata via Binarization
Encoding unranked trees to binary trees, henceforth called binarization, is an important method to deal with unranked trees. For each of three binarizations we show that weighted (ranked) tree automata together with the binarization are equivalent to weighted unranked tree automata, even in the probabilistic case. This allows one to easily adapt training methods for weighted (ranked) tree automata ...
Finite automata on unranked trees: extensions by arithmetical and equality constraints
The notion of unranked trees has attracted much interest in current research, especially due to their application as formal models of XML documents. In particular, several automata and logic formalisms on unranked trees have been considered (again) in the literature, and many results that had previously been shown for the ranked-tree setting have turned out to hold for the unranked-tree setting...
Journal: CoRR
Volume: abs/1802.05490
Publication date: 2018